1. State Key Laboratory of Digital Intelligent Technology for Unmanned Coal Mining, Anhui University of Science and Technology, Huainan 232001; 2. College of Computer Science and Engineering, Anhui University of Science and Technology, Huainan 232001; 3. School of Information Engineering, Huangshan University, Huangshan 245041
Abstract: To address content distortion, artifact generation, and the insufficient use of frequency-domain characteristics in style transfer, a reversible flow network for style transfer based on frequency-domain enhanced adaptive channel attention and feature pyramid fusion (FECANet) is proposed. Built on the pre-trained VGG19 architecture, the reversible flow network reduces feature loss and preserves content structure through its unbiased feature-transfer mechanism. A frequency-domain enhanced adaptive channel attention module analyzes the frequency-domain distribution of style images and establishes accurate correlations between content and style features, improving the stylization effect. In addition, a feature pyramid fusion scheme aligns global style with local textures, enhancing the coherence of the transfer results. Experiments on the MS-COCO and WikiArt datasets show that FECANet effectively balances stylization and content preservation, achieving superior performance in content-structure integrity, stylization quality, and computational efficiency.
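The abstract does not give the module's exact formulation, but the general idea of frequency-channel attention (popularized by FcaNet) can be sketched as follows: instead of summarizing each channel by global average pooling alone, each channel's spatial map is projected onto a few 2D DCT basis functions, and the resulting frequency descriptors gate the channels. This is a minimal numpy illustration of that concept under assumed frequency choices and a simple sigmoid gate, not the paper's actual FECANet module:

```python
import numpy as np

def dct_basis(h, w, u, v):
    # 2D DCT-II basis function at spatial frequency (u, v).
    # Note: (u, v) = (0, 0) is the all-ones basis, i.e. the
    # global-average-pooling component as a special case.
    ys = np.cos(np.pi * (np.arange(h) + 0.5) * u / h)
    xs = np.cos(np.pi * (np.arange(w) + 0.5) * v / w)
    return np.outer(ys, xs)

def freq_channel_attention(feat, freqs=((0, 0), (0, 1), (1, 0))):
    """feat: (C, H, W) feature map. Returns channel-reweighted features.

    Each channel is summarized by its responses to a small set of
    DCT basis functions; the summed responses are squashed to (0, 1)
    and used as per-channel attention weights.
    """
    c, h, w = feat.shape
    desc = np.zeros(c)
    for u, v in freqs:
        basis = dct_basis(h, w, u, v)
        desc += (feat * basis).sum(axis=(1, 2))   # per-channel frequency response
    weights = 1.0 / (1.0 + np.exp(-desc / (h * w)))  # sigmoid gating
    return feat * weights[:, None, None]

x = np.random.default_rng(0).normal(size=(4, 8, 8))
y = freq_channel_attention(x)
print(y.shape)  # (4, 8, 8): same shape, channels rescaled
```

In a learned module the frequency set and the gating function would be trainable (e.g. a small fully connected layer in place of the fixed sigmoid); the sketch only shows how frequency-domain statistics can replace plain pooling as the channel descriptor.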